Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-30717: [C#] Add ToString() methods to Arrow classes #36566

Merged
merged 7 commits into from
Oct 30, 2023

Conversation

voidstar69
Copy link
Contributor

@voidstar69 voidstar69 commented Jul 8, 2023

What changes are included in this PR?

Implemented the ToString() method on classes ChunkedArray, Field, RecordBatch, Schema and Table.
I could not find the class DataType mentioned in the issue, perhaps it meant DateType?

Closes #30717.

@github-actions
Copy link

github-actions bot commented Jul 8, 2023

⚠️ GitHub issue #30717 has been automatically assigned in GitHub to PR creator.

@voidstar69 voidstar69 marked this pull request as ready for review July 8, 2023 16:56
@voidstar69 voidstar69 requested a review from westonpace as a code owner July 8, 2023 16:56
Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are helpful but I don't know if they address the heart of the issue. My guess is the issue was asking for some kind of pretty printing support. Do you want to perhaps add a follow-up issue for actual table-printing support? For example, here is what the "to string" looks like in various other implementations:

Python:

pyarrow.Table
strings: string
ints: int32
floats: float
----
strings: [["one","two","three","four","five"]]
ints: [[1,2,3,4,5]]
floats: [[1,2,3,4,5]]

C++

strings: string
ints: int32
floats: float
----
strings:
  [
    [
      "one",
      "two",
      "three",
      "four",
      "five"
    ]
  ]
ints:
  [
    [
      1,
      2,
      3,
      4,
      5
    ]
  ]
floats:
  [
    [
      1,
      2,
      3,
      4,
      5
    ]
  ]

Rust:

RecordBatch { schema: Schema { fields: [Field { name: "strings", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "ints", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "floats", data_type: Float64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, columns: [StringArray
[
  "one",
  "two",
  "three",
  "four",
  "five",
], PrimitiveArray<Int32>
[
  1,
  2,
  3,
  4,
  5,
], PrimitiveArray<Float64>
[
  1.0,
  2.0,
  3.0,
  4.0,
  5.0,
]], row_count: 5 }

csharp/src/Apache.Arrow/RecordBatch.cs Outdated Show resolved Hide resolved
@voidstar69
Copy link
Contributor Author

These are helpful but I don't know if they address the heart of the issue. My guess is the issue was asking for some kind of pretty printing support. Do you want to perhaps add a follow-up issue for actual table-printing support? For example, here is what the "to string" looks like in various other implementations:

Python:

pyarrow.Table
strings: string
ints: int32
floats: float
----
strings: [["one","two","three","four","five"]]
ints: [[1,2,3,4,5]]
floats: [[1,2,3,4,5]]

C++

strings: string
ints: int32
floats: float
----
strings:
  [
    [
      "one",
      "two",
      "three",
      "four",
      "five"
    ]
  ]
ints:
  [
    [
      1,
      2,
      3,
      4,
      5
    ]
  ]
floats:
  [
    [
      1,
      2,
      3,
      4,
      5
    ]
  ]

Rust:

RecordBatch { schema: Schema { fields: [Field { name: "strings", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "ints", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "floats", data_type: Float64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, columns: [StringArray
[
  "one",
  "two",
  "three",
  "four",
  "five",
], PrimitiveArray<Int32>
[
  1,
  2,
  3,
  4,
  5,
], PrimitiveArray<Float64>
[
  1.0,
  2.0,
  3.0,
  4.0,
  5.0,
]], row_count: 5 }

I see. I thought the original issue might have been about making these types easier to comprehend in a debugger, such as in Visual Studio. There the debugger invokes ToString on an object, which usually returns a single line of text which the debugger then displays. I think this current PR achieves that goal, although I cannot know the intent of the original issue.

I have now created issue #36777 for adding pretty printing methods.

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jul 19, 2023
@voidstar69 voidstar69 requested a review from westonpace July 19, 2023 22:01
Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it makes sense to put pretty printing somewhere else other than ToString. One more tiny change if you don't mind. Also, could you briefly add some calls to ToString in the unit tests for these types? You probably don't need to create a new test but just something to make sure it converts to string without triggering an exception.

csharp/src/Apache.Arrow/Schema.cs Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Jul 20, 2023
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jul 23, 2023
@voidstar69 voidstar69 requested a review from westonpace July 23, 2023 06:39
@voidstar69
Copy link
Contributor Author

I agree it makes sense to put pretty printing somewhere else other than ToString. One more tiny change if you don't mind. Also, could you briefly add some calls to ToString in the unit tests for these types? You probably don't need to create a new test but just something to make sure it converts to string without triggering an exception.

@westonpace, I think this PR is now ready for your review. Hopefully we can merge this PR soon. Thanks.

@voidstar69
Copy link
Contributor Author

@westonpace are you able to review this PR now? Or is there somebody else who can do this? Thanks.

@voidstar69
Copy link
Contributor Author

@westonpace

@voidstar69
Copy link
Contributor Author

@davidhcoe are you or another maintainer available to review this PR please? This PR has not received a further review since @westonpace reviewed it in July. I have held off from creating further PRs due to this.

@CurtHagenlocher CurtHagenlocher merged commit 0026c0c into apache:main Oct 30, 2023
8 checks passed
@CurtHagenlocher CurtHagenlocher removed the awaiting change review Awaiting change review label Oct 30, 2023
@CurtHagenlocher
Copy link
Contributor

Sorry, meant to look last week but I couldn't figure out how to re-run the failed appveyor build. As I can't imagine it's related, I'll ignore it and cross my fingers... .

@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 0026c0c.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

@voidstar69 voidstar69 deleted the 30717 branch October 30, 2023 22:31
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…36566)

### What changes are included in this PR?

Implemented the ToString() method on classes ChunkedArray, Field, RecordBatch, Schema and Table. 
I could not find the class DataType mentioned in the issue, perhaps it meant DateType?

Closes apache#30717.
* Closes: apache#30717

Lead-authored-by: Gavin Murrison <[email protected]>
Co-authored-by: voidstar69 <[email protected]>
Co-authored-by: Weston Pace <[email protected]>
Signed-off-by: Curt Hagenlocher <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…36566)

### What changes are included in this PR?

Implemented the ToString() method on classes ChunkedArray, Field, RecordBatch, Schema and Table. 
I could not find the class DataType mentioned in the issue, perhaps it meant DateType?

Closes apache#30717.
* Closes: apache#30717

Lead-authored-by: Gavin Murrison <[email protected]>
Co-authored-by: voidstar69 <[email protected]>
Co-authored-by: Weston Pace <[email protected]>
Signed-off-by: Curt Hagenlocher <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C#] Add ToString() methods to Arrow classes
3 participants